TECHNICAL FIELD

The present invention relates generally to the detection and therapy of
breast cancer. The invention is more specifically related to nucleotide sequences that are
preferentially expressed in breast tumor tissue and to polypeptides encoded by such 
nucleotide sequences. The nucleotide sequences and polypeptides may be used in 
vaccines and pharmaceutical compositions for the prevention and treatment of breast
cancer. The polypeptides may also be used for the production of compounds, such as
antibodies, useful for diagnosing and monitoring the progression of breast cancer in a 
patient. 

BACKGROUND OF THE INVENTION 

Breast cancer is a significant health problem for women in the United 
States and throughout the world. Although advances have been made in detection and
treatment of the disease, breast cancer remains the second leading cause of cancer-related
deaths in women, affecting more than 180,000 women in the United States each year.
For women in North America, the life-time odds of getting breast cancer are now one in
eight. 

No vaccine or other universally successful method for the prevention or 
treatment of breast cancer is currently available. Management of the disease currently
relies on a combination of early diagnosis (through routine breast screening procedures)
and aggressive treatment, which may include one or more of a variety of treatments such 
as surgery, radiotherapy, chemotherapy and hormone therapy. The course of treatment 
for a particular breast cancer is often selected based on a variety of prognostic 
parameters, including an analysis of specific tumor markers. See, e.g., Porter-Jordan and
Lippman, Breast Cancer 5:73-100 (1994). However, the use of established markers
often leads to a result that is difficult to interpret, and the high mortality observed in 
breast cancer patients indicates that improvements are needed in the treatment, diagnosis 
and prevention of the disease. 

Accordingly, there is a need in the art for improved methods for therapy
and diagnosis of breast cancer. The present invention fulfills these needs and further
provides other related advantages. 

SUMMARY OF THE INVENTION 

Briefly stated, the subject invention provides compositions and methods 
for the diagnosis and therapy of breast cancer. In one aspect, isolated polynucleotides are
provided, comprising (a) a nucleotide sequence preferentially expressed in breast cancer
tissue, relative to normal tissue; (b) a variant of such a sequence, as defined below; or (c) 
a nucleotide sequence encoding an epitope of a polypeptide encoded by at least one of 
the above sequences. In one embodiment, the isolated polynucleotide comprises a 
human endogenous retroviral sequence recited in SEQ ID NO:1. In other embodiments,
the isolated polynucleotide comprises a sequence recited in any one of SEQ ID NO: 3-26,
28-77, 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207,
209-214, 216, 218, 219, 221-240, 243-245, 247, 250, 251, 253, 255, 257-266, 268, 269, 
271-273, 275, 276, 278, 280, 281, 284, 288, 291-298, 301-303, 307, 313, 314, 316 and 
317.

In related embodiments, the isolated polynucleotide encodes an epitope of
a polypeptide, wherein the polypeptide is encoded by a nucleotide sequence that: (a) 
hybridizes to a sequence recited in any one of SEQ ID NO: 1, 3-26, 28-77, 142, 143, 
146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218,
219, 221-240, 243-245, 247, 250, 251, 253, 255, 257-266, 268, 269, 271-273, 275, 276,
278, 280, 281, 284, 288, 291-298, 301-303, 307, 313, 314, 316 and 317 under stringent 
conditions; and (b) is at least 80% identical to a sequence recited in any one of SEQ ID 
NO: 1, 3-26, 28-77, 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 
206, 207, 209-214, 216, 218, 219, 221-240, 243-245, 247, 250, 251, 253, 255, 257-266,
268, 269, 271-273, 275, 276, 278, 280, 281, 284, 288, 291-298, 301-303, 307, 313, 314,
316 and 317. 

In another embodiment, the present invention provides an isolated 
polynucleotide encoding an epitope of a polypeptide, the polypeptide being encoded by: 
(a) a nucleotide sequence transcribed from the sequence of SEQ ID NO: 141; or (b) a 
variant of said nucleotide sequence that contains one or more nucleotide substitutions,
deletions, insertions and/or modifications at no more than 20% of the nucleotide 
positions, such that the antigenic and/or immunogenic properties of the polypeptide 
encoded by the nucleotide sequence are retained. Isolated DNA and RNA molecules 
comprising a nucleotide sequence complementary to a polynucleotide as described above
are also provided.

In related aspects, the present invention provides recombinant expression 
vectors comprising a polynucleotide as described above and host cells transformed or 
transfected with such expression vectors. 

In further aspects, polypeptides comprising an amino acid sequence 
encoded by a polynucleotide as described above, and monoclonal antibodies that bind to
such polypeptides are provided. In certain embodiments, the inventive polypeptides 
comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 
299, 300, 304-306, 308 and 315, and variants thereof as defined below. 

In yet another aspect, methods are provided for determining the presence 
of breast cancer in a patient. In one embodiment, the method comprises detecting, within
a biological sample, a polypeptide as described above. In another embodiment, the
method comprises detecting, within a biological sample, an RNA molecule encoding a
polypeptide as described above. In yet another embodiment, the method comprises (a) 
intradermally injecting a patient with a polypeptide as described above; and (b) detecting 
an immune response on the patient's skin and therefrom detecting the presence of breast
cancer in the patient. In further embodiments, the present invention provides methods
for determining the presence of breast cancer in a patient as described above wherein the 
polypeptide is encoded by a nucleotide sequence selected from the group consisting of 
SEQ ID NO: 78-86, 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, 241, 
242, 246, 248, 249, 252, 256, 267, 270, 274, 277, 279, 282, 283, 285-287, 289, 290 and
sequences that hybridize thereto under stringent conditions.

In a related aspect, diagnostic kits useful in the determination of breast 
cancer are provided. The diagnostic kits generally comprise either one or more 
monoclonal antibodies as described above, or one or more monoclonal antibodies that 
bind to a polypeptide encoded by a nucleotide sequence selected from the group
consisting of sequences provided in SEQ ID NO: 78-86, 144, 145, 153, 167, 177, 193,
199, 205, 208, 215, 217, 220, 241, 242, 246, 248, 249, 252, 256, 267, 270, 274, 277,
279, 282, 283, 285-287, 289 and 290, and a detection reagent.

Diagnostic kits are also provided that comprise a first polymerase chain 
reaction primer and a second polymerase chain reaction primer, at least one of the
primers being specific for a polynucleotide described herein. In one embodiment, at
least one of the primers comprises at least about 10 contiguous nucleotides of a 
polynucleotide as described above, or a polynucleotide encoding a polypeptide encoded 
by a sequence selected from the group consisting of SEQ ID NO: 78-86, 144, 145, 153, 
167, 177, 193, 199, 205, 208, 215, 217, 220, 241, 242, 246, 248, 249, 252, 256, 267, 270,
274, 277, 279, 282, 283, 285-287, 289 and 290.

Within another related aspect, the diagnostic kit comprises at least one 
oligonucleotide probe, the probe being specific for a polynucleotide described herein. In 
one embodiment, the probe comprises at least about 15 contiguous nucleotides of a 
polynucleotide as described above, or a polynucleotide selected from the group
consisting of SEQ ID NO: 78-86, 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217,
220, 241, 242, 246, 248, 249, 252, 256, 267, 270, 274, 277, 279, 282, 283, 285-287, 289
and 290. 

In another related aspect, the present invention provides methods for 
monitoring the progression of breast cancer in a patient. In one embodiment, the method
comprises: (a) detecting an amount, in a biological sample, of a polypeptide as described
above at a first point in time; (b) repeating step (a) at a subsequent point in time; and (c) 
comparing the amounts of polypeptide detected in steps (a) and (b), and therefrom 
monitoring the progression of breast cancer in the patient. In another embodiment, the 
method comprises (a) detecting an amount, within a biological sample, of an RNA
molecule encoding a polypeptide as described above at a first point in time; (b) repeating
step (a) at a subsequent point in time; and (c) comparing the amounts of RNA molecules 
detected in steps (a) and (b), and therefrom monitoring the progression of breast cancer 
in the patient. In yet other embodiments, the present invention provides methods for 
monitoring the progression of breast cancer in a patient as described above wherein the
polypeptide is encoded by a nucleotide sequence selected from the group consisting of
SEQ ID NO: 78-86, 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, 241, 
242, 246, 248, 249, 252, 256, 267, 270, 274, 277, 279, 282, 283, 285-287, 289, 290 and 
sequences that hybridize thereto under stringent conditions. 

In still other aspects, pharmaceutical compositions, which comprise a
polypeptide as described above in combination with a physiologically acceptable carrier,
and vaccines, which comprise a polypeptide as described above in combination with an 
immunostimulant or adjuvant, are provided. In yet other aspects, the present invention 
provides pharmaceutical compositions and vaccines comprising a polypeptide encoded 
by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86, 144,
145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, 241, 242, 246, 248, 249, 252,
256, 267, 270, 274, 277, 279, 282, 283, 285-287, 289, 290 and sequences that hybridize 
thereto under stringent conditions. 

In related aspects, the present invention provides methods for inhibiting 
the development of breast cancer in a patient, comprising administering to a patient a
pharmaceutical composition or vaccine as described above.



These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if 
each was incorporated individually. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the differential display PCR products, separated by gel 
electrophoresis, obtained from cDNA prepared from normal breast tissue (lanes 1 and 2) 
and from cDNA prepared from breast tumor tissue from the same patient (lanes 3 and 4). 
The arrow indicates the band corresponding to B18Ag1.

Figure 2 is a northern blot comparing the level of B18Ag1 mRNA in
breast tumor tissue (lane 1) with the level in normal breast tissue.

Figure 3 shows the level of B18Ag1 mRNA in breast tumor tissue
compared to that in various normal and non-breast tumor tissues as determined by RNase 
protection assays. 

Figure 4 is a genomic clone map showing the location of additional 
retroviral sequences obtained from ends of XbaI restriction digests (provided in SEQ ID
NO:3 - SEQ ID NO:10) relative to B18Ag1.

Figures 5A and 5B show the sequencing strategy, genomic organization
and predicted open reading frame for the retroviral element containing B18Ag1.

Figure 6 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B18Ag1.

Figure 7 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B17Ag1.

Figure 8 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B17Ag2. 

Figure 9 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B13Ag2a. 

Figure 10 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B13Ag1b.

Figure 11 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B13Ag1a.

Figure 12 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B11Ag1.

Figure 13 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B3CA3c.

Figure 14 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B9CG1. 

Figure 15 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B9CG3.

Figure 16 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B2CA2. 

Figure 17 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B3CA1. 

Figure 18 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B3CA2.

Figure 19 shows the nucleotide sequence of the representative breast 
tumor-specific cDNA B3CA3. 

Figure 20 shows the nucleotide sequence of the representative breast
tumor-specific cDNA B4CA1.

Figure 21A depicts RT-PCR analysis of breast tumor genes in breast
tumor tissues (lanes 1-8) and normal breast tissues (lanes 9-13) and H2O (lane 14).

Figure 21B depicts RT-PCR analysis of breast tumor genes in prostate
tumors (lanes 1, 2), colon tumors (lane 3), lung tumor (lane 4), normal prostate (lane 5),
normal colon (lane 6), normal kidney (lane 7), normal liver (lane 8), normal lung (lane
9), normal ovary (lanes 10, 18), normal pancreases (lanes 11, 12), normal skeletal muscle
(lane 13), normal skin (lane 14), normal stomach (lane 15), normal testes (lane 16),
normal small intestine (lane 17), HBL-100 (lane 19), MCF-12A (lane 20), breast tumors
(lanes 21-23), H2O (lane 24), and colon tumor (lane 25).

Figure 22 shows the recognition of a B11Ag1 peptide (referred to as B11-8)
by an anti-B11-8 CTL line.

Figure 23 shows the recognition of a cell line transduced with the antigen
B11Ag1 by the B11-8 specific clone A1.

Figure 24 shows recognition of a lung adenocarcinoma line (LT-140-22)
and a breast adenocarcinoma line (CAMA-1) by the B11-8 specific clone A1.

DETAILED DESCRIPTION OF THE INVENTION

As noted above, the present invention is generally directed to 
compositions and methods for the diagnosis, monitoring and therapy of breast cancer. 
The compositions described herein include polypeptides, polynucleotides and antibodies. 
Polypeptides of the present invention generally comprise at least a portion of a protein
that is expressed at a greater level in human breast tumor tissue than in normal breast
tissue (i.e., the level of RNA encoding the polypeptide is at least 2-fold higher in tumor 
tissue). Such polypeptides are referred to herein as breast tumor-specific polypeptides, 
and cDNA molecules encoding such polypeptides are referred to as breast tumor-specific 
cDNAs. Polynucleotides of the subject invention generally comprise a DNA or RNA
sequence that encodes all or a portion of a polypeptide as described above, or that is
complementary to such a sequence. Antibodies are generally immune system proteins, 
or fragments thereof, that are capable of binding to a portion of a polypeptide as 
described above. Antibodies can be produced by cell culture techniques, including the 
generation of monoclonal antibodies as described herein, or via transfection of antibody
genes into suitable bacterial or mammalian cell hosts, in order to allow for the production
of recombinant antibodies.
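
The expression criterion stated above (RNA encoding the polypeptide at a level at least 2-fold higher in tumor tissue than in normal breast tissue) may be illustrated by the following minimal sketch. The function names, the use of arbitrary normalized expression units, and the rejection of a non-positive normal-tissue level are assumptions made purely for illustration and form no part of the disclosed methods.

```python
def fold_change(tumor_level: float, normal_level: float) -> float:
    """Ratio of the tumor RNA level to the normal-tissue RNA level.

    Both levels are assumed to be in the same (arbitrary) normalized units.
    """
    if normal_level <= 0:
        # Assumption: a ratio is undefined without a positive reference level.
        raise ValueError("normal_level must be positive")
    return tumor_level / normal_level


def is_breast_tumor_specific(tumor_level: float,
                             normal_level: float,
                             min_fold: float = 2.0) -> bool:
    """True when the tumor level is at least `min_fold` times the normal level."""
    return fold_change(tumor_level, normal_level) >= min_fold
```

For example, under these assumptions a tumor level of 10 units against a normal level of 5 units satisfies the 2-fold criterion, whereas 3 units against 2 units does not.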

Polypeptides within the scope of this invention include, but are not
limited to, polypeptides (and epitopes thereof) encoded by a human endogenous
retroviral sequence, such as the sequence designated B18Ag1 (Figure 5 and SEQ ID
NO:1). Also within the scope of the present invention are polypeptides encoded by other
sequences within the retroviral genome containing B18Ag1 (SEQ ID NO: 141). Such
sequences include, but are not limited to, the sequences recited in SEQ ID NO: 3 - SEQ
ID NO:10. B18Ag1 has homology to the gag p30 gene of the endogenous human
retroviral element S71, as described in Werner et al., Virology 174:225-238 (1990), and
also shows homology to about thirty other retroviral gag genes. As discussed in more
detail below, the present invention also includes a number of additional breast
tumor-specific polypeptides, such as those encoded by the nucleotide sequences recited in SEQ
ID NO: 11-26, 28-77, 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204,
206, 207, 209-214, 216, 218, 219, 221-240, 243-245, 247, 250, 251, 253, 255, 257-266,
268, 269, 271-273, 275, 276, 278, 280, 281, 284, 288, 291-298, 301-303, 307, 313, 314,
316 and 317.

As used herein, the term "polypeptide" encompasses amino acid chains of 
any length, including full length proteins containing the sequences recited herein. A 
polypeptide comprising an epitope of a protein containing a sequence as described herein 
may consist entirely of the epitope, or may contain additional sequences. The additional
sequences may be derived from the native protein or may be heterologous, and such
sequences may (but need not) possess immunogenic or antigenic properties. 

An "epitope," as used herein is a portion of a polypeptide that is 
recognized (i.e., specifically bound) by a B-cell and/or T-cell surface antigen receptor. 
Epitopes may generally be identified using well known techniques, such as those
summarized in Paul, Fundamental Immunology, 3rd ed., 243-247 (Raven Press, 1993)
and references cited therein. Such techniques include screening polypeptides derived 
from the native polypeptide for the ability to react with antigen-specific antisera and/or 
T-cell lines or clones. An epitope of a polypeptide is a portion that reacts with such 
antisera and/or T-cells at a level that is similar to the reactivity of the full length
polypeptide (e.g., in an ELISA and/or T-cell reactivity assay). Such screens may
generally be performed using methods well known to those of ordinary skill in the art, 
such as those described in Harlow and Lane, Antibodies: A Laboratory Manual, Cold 
Spring Harbor Laboratory, 1988. B-cell and T-cell epitopes may also be predicted via 
computer analysis. Polypeptides comprising an epitope of a polypeptide that is
preferentially expressed in a tumor tissue (with or without additional amino acid
sequence) are within the scope of the present invention. 

The term "polynucleotide(s)," as used herein, means a single or double- 
stranded polymer of deoxyribonucleotide or ribonucleotide bases and includes DNA and 
corresponding RNA molecules, including HnRNA and mRNA molecules, both sense and
anti-sense strands, and comprehends cDNA, genomic DNA and recombinant DNA, as
well as wholly or partially synthesized polynucleotides. An HnRNA molecule contains
introns and corresponds to a DNA molecule in a generally one-to-one manner. An
mRNA molecule corresponds to an HnRNA and DNA molecule from which the introns
have been excised. A polynucleotide may consist of an entire gene, or any portion
thereof. Operable anti-sense polynucleotides may comprise a fragment of the
corresponding polynucleotide, and the definition of "polynucleotide" therefore includes
all such operable anti-sense fragments.

The compositions and methods of the present invention also encompass 
variants of the above polypeptides and polynucleotides. 

A polypeptide "variant," as used herein, is a polypeptide that differs from 
the recited polypeptide only in conservative substitutions and/or modifications, such that
the antigenic properties of the polypeptide are retained. In a preferred embodiment,
variant polypeptides differ from an identified sequence by substitution, deletion or 
addition of five amino acids or fewer. Such variants may generally be identified by 
modifying one of the above polypeptide sequences, and evaluating the antigenic 
properties of the modified polypeptide using, for example, the representative procedures
described herein. Polypeptide variants preferably exhibit at least about 70%, more
preferably at least about 90% and most preferably at least about 95% identity 
(determined as described below) to the identified polypeptides. 

As used herein, a "conservative substitution" is one in which an amino 
acid is substituted for another amino acid that has similar properties, such that one skilled
in the art of peptide chemistry would expect the secondary structure and hydropathic
nature of the polypeptide to be substantially unchanged. In general, the following groups
of amino acids represent conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser,
thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr,
trp, his.
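
By way of illustration only, the five groups of conservative changes listed above can be represented as sets and queried as in the following sketch. The function name and the choice to treat an unchanged residue as "not a substitution" are assumptions for this example; standard three-letter amino acid abbreviations are used, and a residue may belong to more than one group.

```python
# Conservative-change groups (1)-(5) from the text, as three-letter codes.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},  # group 1
    {"cys", "ser", "tyr", "thr"},                                     # group 2
    {"val", "ile", "leu", "met", "ala", "phe"},                       # group 3
    {"lys", "arg", "his"},                                            # group 4
    {"phe", "tyr", "trp", "his"},                                     # group 5
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if both residues fall within at least one common group."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        # Assumption: an identical residue is not counted as a substitution.
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this sketch, replacing lys with arg is reported as conservative (both are in group 4), while replacing lys with glu is not, since no listed group contains both.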

Variants may also, or alternatively, contain other modifications, including
the deletion or addition of amino acids that have minimal influence on the antigenic
properties, secondary structure and hydropathic nature of the polypeptide. For example, 
a polypeptide may be conjugated to a signal (or leader) sequence at the N-terminal end of 
the protein which co-translationally or post-translationally directs transfer of the protein.
The polypeptide may also be conjugated to a linker or other sequence for ease of
synthesis, purification or identification of the polypeptide (e.g., poly-His), or to enhance
binding of the polypeptide to a solid support. For example, a polypeptide may be
conjugated to an immunoglobulin Fc region. 

A nucleotide "variant" is a sequence that differs from the recited 
nucleotide sequence in having one or more nucleotide deletions, substitutions or 
additions. Such modifications may be readily introduced using standard mutagenesis
techniques, such as oligonucleotide-directed site-specific mutagenesis as taught, for
example, by Adelman et al. (DNA, 2:183, 1983). Nucleotide variants may be naturally
occurring allelic variants, or non-naturally occurring variants. Variant nucleotide
sequences preferably exhibit at least about 70%, more preferably at least about 80% and
most preferably at least about 90% identity (determined as described below) to the
recited sequence. 

The breast tumor antigens provided by the present invention include 
variants that are encoded by DNA sequences which are substantially homologous to one 
or more of the DNA sequences specifically recited herein. "Substantial homology," as
used herein, refers to DNA sequences that are capable of hybridizing under moderately
stringent conditions. Suitable moderately stringent conditions include prewashing in a 
solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X 
SSC, overnight or, in the event of cross-species homology, at 45°C with 0.5X SSC; 
followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC
containing 0.1% SDS. Such hybridizing DNA sequences are also within the scope of
this invention, as are nucleotide sequences that, due to code degeneracy, encode an 
immunogenic polypeptide that is encoded by a hybridizing DNA sequence. 

Two nucleotide or polypeptide sequences are said to be "identical" if the 
sequence of nucleotides or amino acid residues in the two sequences is the same when
aligned for maximum correspondence as described below. Comparisons between two
sequences are typically performed by comparing the sequences over a comparison 
window to identify and compare local regions of sequence similarity. A "comparison 
window" as used herein, refers to a segment of at least about 20 contiguous positions, 
usually 30 to about 75, or 40 to about 50, in which a sequence may be compared to a
reference sequence of the same number of contiguous positions after the two sequences
are optimally aligned.

Optimal alignment of sequences for comparison may be conducted using
the Megalign program in the Lasergene suite of bioinformatics software (DNASTAR, 
Inc., Madison, WI), using default parameters. This program embodies several alignment 
schemes described in the following references: Dayhoff, M.O. (1978) A model of
evolutionary change in proteins - Matrices for detecting distant relationships. In
Dayhoff, M.O. (ed.) Atlas of Protein Sequence and Structure, National Biomedical
Research Foundation, Washington DC Vol 5, Suppl. 3, pp. 345-358; Hein J. (1990)
Unified Approach to Alignment and Phylogenes pp. 626-645 Methods in Enzymology
vol 183, Academic Press, Inc., San Diego, CA; Higgins, D.G. and Sharp, P.M. (1989)
Fast and sensitive multiple sequence alignments on a microcomputer CABIOS 5:151-153;
Myers, E.W. and Muller W. (1988) Optimal alignments in linear space CABIOS
4:11-17; Robinson, E.D. (1971) Comb. Theor 11:105; Saitou, N. and Nei, M. (1987) The
neighbor joining method. A new method for reconstructing phylogenetic trees Mol Biol
Evol 4:406-425; Sneath, P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy - the
Principles and Practice of Numerical Taxonomy, Freeman Press, San Francisco, CA;
Wilbur, W.J. and Lipman, D.J. (1983) Rapid similarity searches of nucleic acid and
protein data banks Proc. Natl. Acad. Sci. USA 80:726-730.

Preferably, the "percentage of sequence identity" is 
determined by comparing two optimally aligned sequences over a window of comparison
of at least 20 positions, wherein the portion of the polynucleotide sequence in the
comparison window may comprise additions or deletions (i.e. gaps) of 20 percent or less,
usually 5 to 15 percent, or 10 to 12 percent, as compared to the reference sequence
(which does not comprise additions or deletions) for optimal alignment of the two
sequences. The percentage is calculated by determining the number of positions at which
the identical nucleic acid base or amino acid residue occurs in both sequences to yield
the number of matched positions, dividing the number of matched positions by the total
number of positions in the reference sequence (i.e. the window size) and multiplying the
result by 100 to yield the percentage of sequence identity.

In general, polynucleotides
encoding all or a portion of the polypeptides described herein may be prepared using any
of several techniques. For example, cDNA molecules encoding such polypeptides may
be cloned on the basis of the breast tumor-specific expression of the corresponding
mRNAs, using differential display PCR. This technique compares the amplified
products from RNA template prepared from normal and breast tumor tissue. cDNA may
be prepared by reverse transcription of RNA using a (dT)12AG primer. Following
amplification of the cDNA using a random primer, a band corresponding to an amplified
product specific to the tumor RNA may be cut out from a silver stained gel and
subcloned into a suitable vector (e.g., the T-vector, Novagen, Madison, WI).
Polynucleotides encoding all or a portion of the breast tumor-specific polypeptides 
disclosed herein may be amplified from cDNA prepared as described above using the 
random primers shown in SEQ ID NO.:87-125. 
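
The "percentage of sequence identity" calculation described earlier (matched positions divided by the number of positions in the reference sequence, i.e. the window size, multiplied by 100) may be sketched as follows. This is an illustrative sketch only and is not the Megalign implementation: it assumes the two sequences have already been optimally aligned to equal length by some other means, that "-" denotes a gap in the compared sequence, and that the reference sequence itself is ungapped. The function name and parameters are hypothetical.

```python
def percent_identity(reference: str, aligned: str,
                     gap: str = "-", min_window: int = 20) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window.

    `reference` is the ungapped reference window; `aligned` is the compared
    sequence, already aligned to it and padded with `gap` where residues
    are deleted relative to the reference.
    """
    if len(reference) != len(aligned):
        raise ValueError("aligned sequences must have equal length")
    if gap in reference:
        raise ValueError("reference window must not contain gaps")
    window_size = len(reference)
    if window_size < min_window:
        # The text specifies a comparison window of at least 20 positions.
        raise ValueError("comparison window is shorter than min_window")
    # Count positions at which the identical base or residue occurs in both.
    matched = sum(1 for r, q in zip(reference.upper(), aligned.upper())
                  if r == q)
    return 100.0 * matched / window_size
```

Under these assumptions, a 20-position window with one mismatch (or one gap) in the compared sequence yields 19 matched positions out of 20.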

Alternatively, a polynucleotide encoding a polypeptide as described 
herein (or a portion thereof) may be amplified from human genomic DNA, or from breast
tumor cDNA, via polymerase chain reaction. For this approach, B18Ag1
sequence-specific primers may be designed based on the sequence provided in SEQ ID NO:1, and
may be purchased or synthesized. One suitable primer pair for amplification from breast
tumor cDNA is (5'ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and
(5'CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127). An amplified portion of
B18Ag1 may then be used to isolate the full length gene from a human genomic DNA
library or from a breast tumor cDNA library, using well known techniques, such as those
described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY (1989). Other sequences within the
retroviral genome of which B18Ag1 is a part may be similarly prepared by screening
human genomic libraries using B18Ag1-specific sequences as probes. Nucleotides
translated into protein from the retroviral genome shown in SEQ ID NO: 141 may then
be determined by cloning the corresponding cDNAs, predicting the open reading frames
and cloning the appropriate cDNAs into a vector containing a viral promoter, such as T7.
The resulting constructs can be employed in a translation reaction, using techniques
known to those of skill in the art, to identify nucleotide sequences which result in 
expressed protein. Similarly, primers specific for the remaining breast tumor-specific 
polypeptides described herein may be designed based on the nucleotide sequences 
provided in SEQ ID NO: 11-86, 142-298, 301-303, 307, 313, 314, 316 and 317. 
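
As a hedged illustration of how the primer pair given above (SEQ ID NO: 126 and 127) might be inspected before use, the following sketch computes primer length, GC fraction, reverse complement and a rough melting temperature. The melting-temperature estimate uses the common rule of thumb Tm = 2(A+T) + 4(G+C) °C, which is not taken from this specification and is only a coarse approximation for short oligonucleotides; the function names are likewise hypothetical. The primer strings are transcribed from the text with spaces removed.

```python
# Primer sequences as printed in the text (5' to 3', spaces removed).
FORWARD_PRIMER = "ATGGCTATTTTCGGGGGCTGACA"  # SEQ ID NO: 126
REVERSE_PRIMER = "CCGGTATCTCCTCGTGGGTATT"   # SEQ ID NO: 127

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of positions that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq: str) -> int:
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C).

    Assumption: this rule of thumb is used only for illustration.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD_PRIMER),
                         ("reverse", REVERSE_PRIMER)):
        print(name, len(primer), round(gc_fraction(primer), 2),
              rule_of_thumb_tm(primer), reverse_complement(primer))
```

Both primers exceed the "at least about 10 contiguous nucleotides" length mentioned for kit primers in the Summary; primers for the other recited sequences could be screened in the same way.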

Recombinant polypeptides encoded by the DNA sequences described
above may be readily prepared from the DNA sequences. For example, supernatants
from suitable host/vector systems which secrete recombinant protein or polypeptide into
culture media may be first concentrated using a commercially available filter. Following
concentration, the concentrate may be applied to a suitable purification matrix such as an
affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps
can be employed to further purify a recombinant polypeptide.

In general, any of a variety of expression vectors known to those of 
ordinary skill in the art may be employed to express recombinant polypeptides of this 
invention. Expression may be achieved in any appropriate host cell that has been 
transformed or transfected with an expression vector containing a polynucleotide that
encodes a recombinant polypeptide. Suitable host cells include prokaryotes, yeast and
higher eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a 
mammalian cell line such as COS or CHO. 

Such techniques may also be used to prepare polypeptides comprising 
epitopes or variants of the native polypeptides. For example, variants of a native
polypeptide may generally be prepared using standard mutagenesis techniques, such as
oligonucleotide-directed site-specific mutagenesis, and sections of the DNA sequence 
may be removed to permit preparation of truncated polypeptides. Portions and other 
variants having fewer than about 100 amino acids, and generally fewer than about 50 
amino acids, may also be generated by synthetic means, using techniques well known to
those of ordinary skill in the art. For example, such polypeptides may be synthesized
using any of the commercially available solid-phase techniques, such as the Merrifield 
solid-phase synthesis method, where amino acids are sequentially added to a growing 
amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963). Equipment
for automated synthesis of polypeptides is commercially available from suppliers such as
Perkin Elmer/Applied BioSystems Division, Foster City, CA, and may be operated
according to the manufacturer's instructions. 

In specific embodiments, polypeptides of the present invention encompass 
amino acid sequences encoded by a polynucleotide having a sequence recited in any one 
of SEQ ID NO:1, 3-26, 28-77, 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198,
200-204, 206, 207, 209-214, 216, 218, 219, 221-240, 243-245, 247, 250, 251, 253, 255,
257-266, 268, 269, 271-273, 275, 276, 278, 280, 281, 284, 288, 291-298, 301-303, 307,
313, 314, 316 and 317, and variants of such polypeptides. Polypeptides within the scope
of the present invention also include polypeptides (and epitopes thereof) encoded by 
DNA sequences that hybridize to a sequence recited in any one of SEQ ID NO:1, 3-26,
28-77, 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209- 
214, 216, 218, 219, 221-240, 243-245, 247, 250, 251, 253, 255, 257-266, 268, 269, 271- 
273, 275, 276, 278, 280, 281, 284, 288, 291-298, 301-303, 307, 313, 314, 316 and 317 
under stringent conditions, wherein the DNA sequences are at least 80% identical in 
overall sequence to a recited sequence and wherein RNA corresponding to the nucleotide 
sequence is expressed at a greater level in human breast tumor tissue than in normal 
breast tissue. As used herein, "stringent conditions" refers to prewashing in a solution of 
6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by 
two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30
minutes each in 0.2 X SSC, 0.1% SDS at 65°C. Polynucleotides according to the present 
invention include molecules that encode any of the above polypeptides. 

In another aspect of the present invention, antibodies are provided. Such 
antibodies may be prepared by any of a variety of techniques known to those of ordinary 
skill in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold 
Spring Harbor Laboratory, 1988. In one such technique, an immunogen comprising the 
polypeptide is initially injected into any of a wide variety of mammals (e.g., mice, rats, 
rabbits, sheep or goats). In this step, the polypeptides of this invention may serve as the 
immunogen without modification. Alternatively, particularly for relatively short 
polypeptides, a superior immune response may be elicited if the polypeptide is joined to 
a carrier protein, such as bovine serum albumin or keyhole limpet hemocyanin. The 
immunogen is injected into the animal host, preferably according to a predetermined 
schedule incorporating one or more booster immunizations, and the animals are bled 
periodically. Polyclonal antibodies specific for the polypeptide may then be purified 
from such antisera by, for example, affinity chromatography using the polypeptide 
coupled to a suitable solid support. 

Monoclonal antibodies specific for the antigenic polypeptide of interest 
may be prepared, for example, using the technique of Kohler and Milstein, Eur. J.
Immunol. 6:511-519 (1976), and improvements thereto. Briefly, these methods involve
the preparation of immortal cell lines capable of producing antibodies having the desired
specificity (i.e., reactivity with the polypeptide of interest). Such cell lines may be 
produced, for example, from spleen cells obtained from an animal immunized as 
described above. The spleen cells are then immortalized by, for example, fusion with a 
myeloma cell fusion partner, preferably one that is syngeneic with the immunized 
animal. A variety of fusion techniques may be employed. For example, the spleen cells 
and myeloma cells may be combined with a nonionic detergent for a few minutes and 
then plated at low density on a selective medium that supports the growth of hybrid cells, 
but not myeloma cells. A preferred selection technique uses HAT (hypoxanthine, 
aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks, 
colonies of hybrids are observed. Single colonies are selected and their culture 
supernatants tested for binding activity against the polypeptide. Hybridomas having high 
reactivity and specificity are preferred. 

Monoclonal antibodies may be isolated from the supernatants of growing 
hybridoma colonies. In addition, various techniques may be employed to enhance the 
yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable 
vertebrate host, such as a mouse. Monoclonal antibodies may then be harvested from the 
ascites fluid or the blood. Contaminants may be removed from the antibodies by 
conventional techniques, such as chromatography, gel filtration, precipitation, and 
extraction. The polypeptides of this invention may be used in the purification process in, 
for example, an affinity chromatography step. 

Antibodies may be used, for example, in methods for detecting breast 
cancer in a patient. Such methods involve using an antibody to detect the presence or 
absence of a breast tumor-specific polypeptide as described herein in a suitable biological 
sample. As used herein, suitable biological samples include tumor or normal tissue 
biopsy, mastectomy, blood, lymph node, serum or urine samples, or other tissue, 
homogenate, or extract thereof obtained from a patient. 

There are a variety of assay formats known to those of ordinary skill in 
the art for using an antibody to detect polypeptide markers in a sample. See, e.g., Harlow 
and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988. For 
example, the assay may be performed in a Western blot format, wherein a protein
preparation from the biological sample is submitted to gel electrophoresis, transferred to
a suitable membrane and allowed to react with the antibody. The presence of the 
antibody on the membrane may then be detected using a suitable detection reagent, as 
described below. 

In another embodiment, the assay involves the use of antibody
immobilized on a solid support to bind to the polypeptide and remove it from the
remainder of the sample. The bound polypeptide may then be detected using a second 
antibody or reagent that contains a reporter group. Alternatively, a competitive assay 
may be utilized, in which a polypeptide is labeled with a reporter group and allowed to
bind to the immobilized antibody after incubation of the antibody with the sample. The
extent to which components of the sample inhibit the binding of the labeled polypeptide 
to the antibody is indicative of the reactivity of the sample with the immobilized 
antibody, and as a result, indicative of the concentration of polypeptide in the sample. 

The solid support may be any material known to those of ordinary skill in
the art to which the antibody may be attached. For example, the solid support may be a
test well in a microtiter plate or a nitrocellulose filter or other suitable membrane. 
Alternatively, the support may be a bead or disc, such as glass, fiberglass, latex or a 
plastic material such as polystyrene or polyvinylchloride. The support may also be a 
magnetic particle or a fiber optic sensor, such as those disclosed, for example, in U.S.
Patent No. 5,359,681.

The antibody may be immobilized on the solid support using a variety of 
techniques known to those in the art, which are amply described in the patent and 
scientific literature. In the context of the present invention, the term "immobilization" 
refers to both noncovalent association, such as adsorption, and covalent attachment
(which may be a direct linkage between the antibody and functional groups on the support
or may be a linkage by way of a cross-linking agent). Immobilization by adsorption to a 
well in a microtiter plate or to a membrane is preferred. In such cases, adsorption may be 
achieved by contacting the antibody, in a suitable buffer, with the solid support for a 
suitable amount of time. The contact time varies with temperature, but is typically
between about 1 hour and 1 day. In general, contacting a well of a plastic microtiter
plate (such as polystyrene or polyvinylchloride) with an amount of antibody ranging
from about 10 ng to about 1 µg, and preferably about 100-200 ng, is sufficient to
immobilize an adequate amount of polypeptide. 

Covalent attachment of antibody to a solid support may also generally be 
achieved by first reacting the support with a bifunctional reagent that will react with both
the support and a functional group, such as a hydroxyl or amino group, on the antibody.
For example, the antibody may be covalently attached to supports having an appropriate 
polymer coating using benzoquinone or by condensation of an aldehyde group on the 
support with an amine and an active hydrogen on the binding partner (see, e.g., Pierce 
Immunotechnology Catalog and Handbook (1991) at A12-A13). 

In certain embodiments, the assay for detection of polypeptide in a sample
is a two-antibody sandwich assay. This assay may be performed by first contacting an
antibody that has been immobilized on a solid support, commonly the well of a
microtiter plate, with the biological sample, such that polypeptides within the sample
are allowed to bind to the immobilized antibody. Unbound sample is then removed from
the immobilized polypeptide-antibody complexes and a second antibody (containing a
reporter group) capable of binding to a different site on the polypeptide is added. The 
amount of second antibody that remains bound to the solid support is then determined 
using a method appropriate for the specific reporter group. 

More specifically, once the antibody is immobilized on the support as
described above, the remaining protein binding sites on the support are typically blocked.
Any suitable blocking agent known to those of ordinary skill in the art, such as bovine
serum albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO), may be employed.
The immobilized antibody is then incubated with the sample, and polypeptide is allowed
to bind to the antibody. The sample may be diluted with a suitable diluent, such as
phosphate-buffered saline (PBS), prior to incubation. In general, an appropriate contact
time (i.e., incubation
time) is that period of time that is sufficient to detect the presence of polypeptide within a 
sample obtained from an individual with breast cancer. Preferably, the contact time is 
sufficient to achieve a level of binding that is at least 95% of that achieved at equilibrium 
between bound and unbound polypeptide. Those of ordinary skill in the art will
recognize that the time necessary to achieve equilibrium may be readily determined by
assaying the level of binding that occurs over a period of time. At room temperature, an
incubation time of about 30 minutes is generally sufficient. 
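
As a hedged illustration of how such a contact time might be read off a binding time course, the following Python sketch returns the first sampled time at which the signal reaches 95% of the plateau. The function name, the sample data and the use of the last measurement as the equilibrium value are assumptions made for illustration and do not appear in the specification.

```python
def time_to_fraction_of_equilibrium(times, signals, fraction=0.95):
    """Return the first sampled time at which binding reaches `fraction` of the plateau.

    Assumption: the last measured signal approximates equilibrium binding,
    i.e. the time course was followed long enough to reach a plateau.
    """
    if len(times) != len(signals) or not times:
        raise ValueError("times and signals must be non-empty and equal in length")
    threshold = fraction * signals[-1]
    for t, s in zip(times, signals):
        if s >= threshold:
            return t
    return times[-1]


# Hypothetical time course: minutes of incubation and bound signal (arbitrary units).
minutes = [5, 10, 20, 30, 60, 120]
bound = [0.40, 0.65, 0.88, 0.96, 0.99, 1.00]
print(time_to_fraction_of_equilibrium(minutes, bound))  # prints 30
```

With these made-up values the 95% level is first reached at the 30 minute time point, which is in line with the room-temperature figure given above.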

Unbound sample may then be removed by washing the solid support with 
an appropriate buffer, such as PBS containing 0.1% Tween 20™. The second antibody,
which contains a reporter group, may then be added to the solid support. Preferred
reporter groups include enzymes (such as horseradish peroxidase), substrates, cofactors, 
inhibitors, dyes, radionuclides, luminescent groups, fluorescent groups and biotin. The 
conjugation of antibody to reporter group may be achieved using standard methods 
known to those of ordinary skill in the art. 

The second antibody is then incubated with the immobilized antibody-
polypeptide complex for an amount of time sufficient to detect the bound polypeptide.
An appropriate amount of time may generally be determined by assaying the level of 
binding that occurs over a period of time. Unbound second antibody is then removed 
and bound second antibody is detected using the reporter group. The method employed
for detecting the reporter group depends upon the nature of the reporter group. For
radioactive groups, scintillation counting or autoradiographic methods are generally 
appropriate. Spectroscopic methods may be used to detect dyes, luminescent groups and 
fluorescent groups. Biotin may be detected using avidin, coupled to a different reporter 
group (commonly a radioactive or fluorescent group or an enzyme). Enzyme reporter
groups may generally be detected by the addition of substrate (generally for a specific
period of time), followed by spectroscopic or other analysis of the reaction products. 

To determine the presence or absence of breast cancer, the signal detected 
from the reporter group that remains bound to the solid support is generally compared to 
a signal that corresponds to a predetermined cut-off value established from non-tumor
tissue. In one preferred embodiment, the cut-off value is the mean signal
obtained when the immobilized antibody is incubated with samples from patients without 
breast cancer. In general, a sample generating a signal that is three standard deviations 
above the predetermined cut-off value may be considered positive for breast cancer. In 
an alternate preferred embodiment, the cut-off value is determined using a Receiver
Operator Curve, according to the method of Sackett et al., Clinical Epidemiology: A
Basic Science for Clinical Medicine, p. 106-7 (Little Brown and Co., 1985). Briefly, in
this embodiment, the cut-off value may be determined from a plot of pairs of true
positive rates (i.e., sensitivity) and false positive rates (100%-specificity) that correspond 
to each possible cut-off value for the diagnostic test result. The cut-off value on the plot 
that is the closest to the upper left-hand corner (i.e., the value that encloses the largest
area) is the most accurate cut-off value, and a sample generating a signal that is higher
than the cut-off value determined by this method may be considered positive. 
Alternatively, the cut-off value may be shifted to the left along the plot, to minimize the 
false positive rate, or to the right, to minimize the false negative rate. In general, a sample 
generating a signal that is higher than the cut-off value determined by this method is
considered positive for breast cancer.
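
The two cut-off strategies described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure of Sackett et al.: the function names and sample signals are hypothetical, the sample (n-1) standard deviation is assumed, candidate cut-offs are taken to be the observed signal values, and "closest to the upper left-hand corner" is interpreted as the smallest Euclidean distance to the point (false positive rate 0, true positive rate 1).

```python
from statistics import mean, stdev


def mean_plus_sd_threshold(control_signals, n_sd=3.0):
    """Signal level lying n_sd standard deviations above the mean control signal."""
    return mean(control_signals) + n_sd * stdev(control_signals)


def roc_cutoff(control_signals, patient_signals):
    """Pick the cut-off whose ROC point is nearest the upper left-hand corner.

    A sample is called positive when its signal is strictly higher than the
    cut-off. Returns (cutoff, true_positive_rate, false_positive_rate).
    """
    best = None
    for c in sorted(set(control_signals) | set(patient_signals)):
        tpr = sum(s > c for s in patient_signals) / len(patient_signals)
        fpr = sum(s > c for s in control_signals) / len(control_signals)
        distance = (fpr ** 2 + (1.0 - tpr) ** 2) ** 0.5
        if best is None or distance < best[0]:
            best = (distance, c, tpr, fpr)
    return best[1], best[2], best[3]


# Hypothetical assay signals (arbitrary absorbance units).
controls = [0.10, 0.12, 0.11, 0.13, 0.09, 0.30]
patients = [0.25, 0.40, 0.55, 0.60, 0.35, 0.12]

print(mean_plus_sd_threshold(controls))
print(roc_cutoff(controls, patients))
```

Moving the chosen cut-off to a higher signal value lowers the false positive rate at the cost of sensitivity, which corresponds to shifting left along the plot as described above.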

In a related embodiment, the assay is performed in a flow-through or strip 
test format, wherein the antibody is immobilized on a membrane, such as nitrocellulose. 
In the flow-through test, polypeptides within the sample bind to the immobilized
antibody as the sample passes through the membrane. A second, labeled antibody then
binds to the antibody-polypeptide complex as a solution containing the second antibody
flows through the membrane. The detection of bound second antibody may then be 
performed as described above. In the strip test format, one end of the membrane to 
which antibody is bound is immersed in a solution containing the sample. The sample 
migrates along the membrane through a region containing second antibody and to the
area of immobilized antibody. Concentration of second antibody at the area of
immobilized antibody indicates the presence of breast cancer. Typically, the 
concentration of second antibody at that site generates a pattern, such as a line, that can 
be read visually. The absence of such a pattern indicates a negative result. In general, 
the amount of antibody immobilized on the membrane is selected to generate a visually
discernible pattern when the biological sample contains a level of polypeptide that would
be sufficient to generate a positive signal in the two-antibody sandwich assay, in the 
format discussed above. Preferably, the amount of antibody immobilized on the 
membrane ranges from about 25 ng to about 1 µg, and more preferably from about 50 ng
to about 1 µg. Such tests can typically be performed with a very small amount of
biological sample.

The presence or absence of breast cancer in a patient may also be
determined by evaluating the level of mRNA encoding a breast tumor-specific 
polypeptide as described herein within the biological sample (e.g., a biopsy, mastectomy 
and/or blood sample from a patient) relative to a predetermined cut-off value. Such an
evaluation may be achieved using any of a variety of methods known to those of ordinary
skill in the art such as, for example, in situ hybridization and amplification by 
polymerase chain reaction. 

For example, polymerase chain reaction may be used to amplify 
sequences from cDNA prepared from RNA that is isolated from one of the above
biological samples. Sequence-specific primers for use in such amplification may be
designed based on the sequences provided in any one of SEQ ID NO:1, 11-86, 142-298,
301-303, 307, 313, 314, 316 and 317, and may be purchased or synthesized. In the case
of B18Ag1, as noted herein, one suitable primer pair is B18Ag1-2 (5'ATG GCT ATT
TTC GGG GGC TGA CA) (SEQ ID NO:126) and B18Ag1-3 (5'CCG GTA TCT CCT
CGT GGG TAT T) (SEQ ID NO:127). The PCR reaction products may then be
separated by gel electrophoresis and visualized according to methods well known to 
those of ordinary skill in the art. Amplification is typically performed on samples 
obtained from matched pairs of tissue (tumor and non-tumor tissue from the same 
individual) or from unmatched pairs of tissue (tumor and non-tumor tissue from different
individuals). The amplification reaction is preferably performed on several dilutions of
cDNA spanning two orders of magnitude. A two-fold or greater increase in expression 
in several dilutions of the tumor sample as compared to the same dilution of the non- 
tumor sample is considered positive. 
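
The dilution comparison can be illustrated with a short Python sketch. It is only a sketch under assumptions: band intensities are taken to be already quantified per dilution, "several dilutions" is interpreted as at least two (the `min_dilutions` parameter), and all names and numbers are hypothetical.

```python
def fold_increase(tumor_signal, normal_signal):
    """Ratio of tumor to non-tumor signal at one dilution."""
    if normal_signal == 0:
        return float("inf") if tumor_signal > 0 else 0.0
    return tumor_signal / normal_signal


def is_positive_by_dilution(tumor, normal, min_fold=2.0, min_dilutions=2):
    """Compare tumor and non-tumor signals at matching cDNA dilutions.

    `tumor` and `normal` map a dilution factor to a quantified band intensity.
    Returns (positive, dilutions_meeting_the_fold_criterion).
    """
    shared = sorted(set(tumor) & set(normal))
    hits = [d for d in shared if fold_increase(tumor[d], normal[d]) >= min_fold]
    return len(hits) >= min_dilutions, hits


# Hypothetical band intensities at 1:1, 1:10 and 1:100 dilutions of cDNA.
tumor_bands = {1: 900.0, 10: 400.0, 100: 60.0}
normal_bands = {1: 800.0, 10: 150.0, 100: 20.0}
print(is_positive_by_dilution(tumor_bands, normal_bands))
```

In this made-up example the undiluted lane is near saturation and shows little difference, while the two more dilute lanes exceed the two-fold criterion, which is one reason for spanning two orders of magnitude.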

As used herein, the term "primer/probe specific for a polynucleotide"
means an oligonucleotide sequence that has at least about 80% identity, preferably at
least about 90% and more preferably at least about 95%, identity to the polynucleotide in 
question, or an oligonucleotide sequence that is anti-sense to a sequence that has at least 
about 80% identity, preferably at least about 90% and more preferably at least about 
95%, identity to the polynucleotide in question. Primers and/or probes which may be
usefully employed in the inventive diagnostic methods preferably have at least about 10-
40 nucleotides. In a preferred embodiment, the polymerase chain reaction primers
comprise at least about 10 contiguous nucleotides of a polynucleotide that encodes one of
the polypeptides disclosed herein or that is anti-sense to a sequence that encodes one of
the polypeptides disclosed herein. Preferably, oligonucleotide probes for use in the
inventive diagnostic methods comprise at least about 15 contiguous nucleotides of a
polynucleotide that encodes one of the polypeptides disclosed herein or that is anti-sense
to a sequence that encodes one of the polypeptides disclosed herein. Techniques for both 
PCR based assays and in situ hybridization assays are well known in the art. 
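
To make the identity criterion concrete, the following Python sketch scores an oligonucleotide against a target in both the sense and anti-sense orientations. It assumes a simple ungapped sliding-window comparison, which is only one possible way of computing percent identity and is not specified in the text; the function names and the flanking bases in the example target are hypothetical, while the primer is the B18Ag1-2 sequence quoted above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def best_ungapped_identity(oligo, target):
    """Highest percent identity of `oligo` over any same-length window of `target`."""
    oligo, target = oligo.upper(), target.upper()
    n = len(oligo)
    if n == 0 or n > len(target):
        return 0.0
    best = 0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        best = max(best, sum(a == b for a, b in zip(oligo, window)))
    return 100.0 * best / n


def is_specific(oligo, target, min_identity=80.0):
    """True if the oligo, or its reverse complement, meets the identity threshold."""
    sense = best_ungapped_identity(oligo, target)
    antisense = best_ungapped_identity(reverse_complement(oligo), target)
    return max(sense, antisense) >= min_identity


primer = "ATGGCTATTTTCGGGGGCTGACA"      # B18Ag1-2 as quoted in the text
target = "TTTT" + primer + "TTTT"       # hypothetical target with made-up flanks
print(best_ungapped_identity(primer, target))
print(is_specific(reverse_complement(primer), target))
```

Raising `min_identity` to 90.0 or 95.0 corresponds to the preferred and more preferred thresholds recited in the definition.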

Conventional RT-PCR protocols using agarose and ethidium bromide 
staining, while important in defining gene specificity, do not lend themselves to
diagnostic kit development because of the time and effort required in making them
quantitative (i.e., construction of saturation and/or titration curves), and their limited
sample throughput. This problem is overcome by the development of procedures such as
real time RT-PCR, which allows for assays to be performed in single tubes, and in turn
can be modified for use in 96 well plate formats. Instrumentation to perform such
methodologies is available from Perkin Elmer/Applied Biosystems Division.
Alternatively, other high throughput assays using labeled probes (e.g., digoxygenin) in
combination with labeled (e.g., enzyme, fluorescent, radioactive) antibodies to such
probes can also be used in the development of 96 well plate assays. 

In yet another method for determining the presence or absence of breast
cancer in a patient, one or more of the breast tumor-specific polypeptides described may
be used in a skin test. As used herein, a "skin test" is any assay performed directly on a 
patient in which a delayed-type hypersensitivity (DTH) reaction (such as swelling, 
reddening or dermatitis) is measured following intradermal injection of one or more 
polypeptides as described above. Such injection may be achieved using any suitable
device sufficient to contact the polypeptide or polypeptides with dermal cells of the
patient, such as a tuberculin syringe or 1 mL syringe. Preferably, the reaction is 
measured at least 48 hours after injection, more preferably 48-72 hours. 

The DTH reaction is a cell-mediated immune response, which is greater in 
patients that have been exposed previously to a test antigen (i.e., an immunogenic portion
of a polypeptide employed, or a variant thereof). The response may be measured visually,
using a ruler. In general, a response that is greater than about 0.5 cm in diameter,
preferably greater than about 5.0 cm in diameter, is a positive response, indicative of
breast cancer. 

The breast tumor-specific polypeptides described herein are preferably 
formulated, for use in a skin test, as pharmaceutical compositions containing at least one 
polypeptide and a physiologically acceptable carrier, such as water, saline, alcohol, or a 
buffer. Such compositions typically contain one or more of the above polypeptides in an 
amount ranging from about 1 µg to 100 µg, preferably from about 10 µg to 50 µg in a
volume of 0.1 mL. Preferably, the carrier employed in such pharmaceutical 
compositions is a saline solution with appropriate preservatives, such as phenol and/or 
Tween 80™. 

In other aspects of the present invention, the progression and/or response 
to treatment of a breast cancer may be monitored by performing any of the above assays 
over a period of time, and evaluating the change in the level of the response (i.e., the 
amount of polypeptide or mRNA detected or, in the case of a skin test, the extent of the 
immune response detected). For example, the assays may be performed every month to 
every other month for a period of 1 to 2 years. In general, breast cancer is progressing in 
those patients in whom the level of the response increases over time. In contrast, breast 
cancer is not progressing when the signal detected either remains constant or decreases 
with time. 
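
One way to summarize such serial measurements is a least-squares slope over time, sketched below in Python. The use of a linear fit, the `tolerance` parameter for deciding that a level has stayed constant, and the sample values are all assumptions made for illustration; the text itself only distinguishes increasing from constant or decreasing levels.

```python
def trend_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("measurement times must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def classify_progression(times, values, tolerance=0.0):
    """'progressing' if the level rises over time, otherwise 'not progressing'."""
    if trend_slope(times, values) > tolerance:
        return "progressing"
    return "not progressing"


# Hypothetical monthly assay results (arbitrary units).
months = [0, 1, 2, 3, 4, 5]
rising = [1.0, 1.1, 1.3, 1.6, 1.8, 2.1]
stable = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(classify_progression(months, rising))
print(classify_progression(months, stable))
```

A non-zero `tolerance` would let small upward drifts attributable to assay variability be treated as constant; choosing that value is outside the scope of this sketch.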

In further aspects of the present invention, the compounds described 
herein may be used for the immunotherapy of breast cancer. In these aspects, the 
compounds (which may be polypeptides, antibodies or polynucleotides) are preferably 
incorporated into pharmaceutical compositions or vaccines. Pharmaceutical 
compositions comprise one or more such compounds and a physiologically acceptable 
carrier. Vaccines may comprise one or more such compounds in combination with an 
immunostimulant, such as an adjuvant or a liposome (into which the compound is 
incorporated). An immunostimulant may be any substance that enhances or potentiates 
an immune response (antibody and/or cell-mediated) to an exogenous antigen. Examples 
of immunostimulants include adjuvants, biodegradable microspheres (e.g., polylactic 
galactide) and liposomes (into which the compound is incorporated; see e.g., Fullerton, 
U.S. Patent No. 4,235,877). Vaccine preparation is generally described in, for example,
M.F. Powell and M.J. Newman, eds., "Vaccine Design (the subunit and adjuvant
approach)," Plenum Press (NY, 1995). Pharmaceutical compositions and vaccines 
within the scope of the present invention may also contain other compounds, which may 
be biologically active or inactive. For example, one or more immunogenic portions of
other tumor antigens may be present, either incorporated into a fusion polypeptide or as a
separate compound, within the composition or vaccine. 

Alternatively, a vaccine may contain DNA encoding one or more of the 
polypeptides as described above, such that the polypeptide is generated in situ. In such 
vaccines, the DNA may be present within any of a variety of delivery systems known to
those of ordinary skill in the art, including nucleic acid expression systems, bacteria and
viral expression systems. Appropriate nucleic acid expression systems contain the 
necessary DNA sequences for expression in the patient (such as a suitable promoter and 
terminating signal). Bacterial delivery systems involve the administration of a bacterium 
(such as Bacillus Calmette-Guerin) that expresses an immunogenic portion of the
polypeptide on its cell surface. In a preferred embodiment, the DNA may be introduced
using a viral expression system (e.g., vaccinia or other pox virus, retrovirus, or
adenovirus), which may involve the use of a non-pathogenic (defective), replication 
competent virus. Techniques for incorporating DNA into such expression systems are 
well known to those of ordinary skill in the art. The DNA may also be "naked," as
described, for example, in Ulmer et al., Science 259:1745-1749 (1993), and reviewed by
Cohen, Science 259:1691-1692 (1993). The uptake of naked DNA may be increased by 
coating the DNA onto biodegradable beads, which are efficiently transported into the 
cells. 

While any suitable carrier known to those of ordinary skill in the art may 
be employed in the pharmaceutical compositions of this invention, the type of carrier will 
vary depending on the mode of administration. For parenteral administration, such as 
subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a 
wax or a buffer. For oral administration, any of the above carriers or a solid carrier, such 
as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum, cellulose, 
glucose, sucrose, and magnesium carbonate, may be employed. Biodegradable
microspheres (e.g., polylactate polyglycolate) may also be employed as carriers for the
pharmaceutical compositions of this invention. 

Any of a variety of immunostimulants may be employed in the vaccines 
of this invention. For example, an adjuvant may be included. Most adjuvants contain a 
substance designed to protect the antigen from rapid catabolism, such as aluminum
hydroxide or mineral oil, and a stimulator of immune responses, such as lipid A, 
Bordetella pertussis or Mycobacterium tuberculosis derived proteins. Suitable adjuvants
are commercially available as, for example, Freund's Incomplete Adjuvant and Complete 
Adjuvant (Difco Laboratories, Detroit, MI); Merck Adjuvant 65 (Merck and Company, 
Inc., Rahway, NJ); AS-2 (SmithKline Beecham, Philadelphia, PA); aluminum salts such
as aluminum hydroxide gel (alum) or aluminum phosphate; salts of calcium, iron or zinc; 
an insoluble suspension of acylated tyrosine; acylated sugars; cationically or anionically 
derivatized polysaccharides; polyphosphazenes; biodegradable microspheres; 
monophosphoryl lipid A and Quil A. Cytokines, such as GM-CSF or interleukin-2, -7, or
-12, may also be used as adjuvants.

Within the vaccines provided herein, the adjuvant composition is 
preferably designed to induce an immune response predominantly of the Th1 type. High
levels of Th1-type cytokines (e.g., IFN-γ, TNF-α, IL-2 and IL-12) tend to favor the
induction of cell mediated immune responses to an administered antigen. In contrast,
high levels of Th2-type cytokines (e.g., IL-4, IL-5, IL-6 and IL-10) tend to favor the
induction of humoral immune responses. Following application of a vaccine as provided
herein, a patient will support an immune response that includes Th1- and Th2-type
responses. Within a preferred embodiment, in which a response is predominantly Th1-
type, the level of Th1-type cytokines will increase to a greater extent than the level of
Th2-type cytokines. The levels of these cytokines may be readily assessed using
standard assays. For a review of the families of cytokines, see Mosmann and Coffman, 
Ann. Rev. Immunol. 7:145-173, 1989. 

Preferred adjuvants for use in eliciting a predominantly Th1-type response
include, for example, a combination of monophosphoryl lipid A, preferably 3-de-O-
acylated monophosphoryl lipid A (3D-MPL), together with an aluminum salt. MPL
adjuvants are available from Corixa Corporation (Seattle, WA; see US Patent Nos.
4,436,727; 4,877,611; 4,866,034 and 4,912,094). CpG-containing oligonucleotides (in
which the CpG dinucleotide is unmethylated) also induce a predominantly Th1 response.
Such oligonucleotides are well known and are described, for example, in WO 96/02555 
and WO 99/33488. Immunostimulatory DNA sequences are also described, for example,
by Sato et al., Science 273:352, 1996. Another preferred adjuvant is a saponin,
preferably QS21 (Aquila Biopharmaceuticals Inc., Framingham, MA), which may be 
used alone or in combination with other adjuvants. For example, an enhanced system 
involves the combination of a monophosphoryl lipid A and saponin derivative, such as 
the combination of QS21 and 3D-MPL as described in WO 94/00153, or a less
reactogenic composition where the QS21 is quenched with cholesterol, as described in
WO 96/33739. Other preferred formulations comprise an oil-in-water emulsion and 
tocopherol. A particularly potent adjuvant formulation involving QS21, 3D-MPL and 
tocopherol in an oil-in-water emulsion is described in WO 95/17210. 

Other preferred adjuvants include Montanide ISA 720 (Seppic, France),
SAF (Chiron, California, United States), ISCOMS (CSL), MF-59 (Chiron), the SBAS
series of adjuvants (e.g., SBAS-2 or SBAS-4, available from SmithKline Beecham, 
Rixensart, Belgium), Detox (Ribi ImmunoChem Research Inc., Hamilton, MT), RC-529 
(Ribi ImmunoChem Research Inc., Hamilton, MT) and Aminoalkyl glucosaminide 4- 
phosphates (AGPs). 

Any vaccine provided herein may be prepared using well known methods
that result in a combination of antigen, immunostimulant and a suitable carrier or
excipient. The compositions described herein may be administered as part of a sustained 
release formulation (i.e., a formulation such as a capsule, sponge or gel (composed of 
polysaccharides, for example) that effects a slow release of compound following
administration). Such formulations may generally be prepared using well known
technology (see, e.g., Coombes et al., Vaccine 14:1429-1438, 1996) and administered by,
for example, oral, rectal or subcutaneous implantation, or by implantation at the desired 
target site. Sustained-release formulations may contain a polypeptide, polynucleotide or 
antibody dispersed in a carrier matrix and/or contained within a reservoir surrounded by
a rate controlling membrane.

Carriers for use within such formulations are biocompatible, and may also
be biodegradable; preferably the formulation provides a relatively constant level of active
component release. Such carriers include microparticles of poly(lactide-co-glycolide), as 
well as polyacrylate, latex, starch, cellulose and dextran. Other delayed-release carriers 
include supramolecular biovectors, which comprise a non-liquid hydrophilic core (e.g., a 
cross-linked polysaccharide or oligosaccharide) and, optionally, an external layer
comprising an amphiphilic compound, such as a phospholipid (see e.g., U.S. Patent No.
5,151,254 and PCT applications WO 94/20078, WO 94/23701 and WO 96/06638). The
amount of active compound contained within a sustained release formulation depends 
upon the site of implantation, the rate and expected duration of release and the nature of 
10 the condition to be treated or prevented. 

Any of a variety of delivery vehicles may be employed within 
pharmaceutical compositions and vaccines to facilitate production of an antigen-specific 
immune response that targets tumor cells. Delivery vehicles include antigen presenting 
cells (APCs), such as dendritic cells, macrophages, B cells, monocytes and other cells 
that may be engineered to be efficient APCs. Such cells may, but need not, be
genetically modified to increase the capacity for presenting the antigen, to improve 
activation and/or maintenance of the T cell response, to have anti-tumor effects per se 
and/or to be immunologically compatible with the receiver (i.e., matched HLA 
haplotype). APCs may generally be isolated from any of a variety of biological fluids 
and organs, including tumor and peritumoral tissues, and may be autologous, allogeneic,
syngeneic or xenogeneic cells. 

Certain preferred embodiments of the present invention use dendritic cells 
or progenitors thereof as antigen-presenting cells. Dendritic cells are highly potent APCs 
(Banchereau and Steinman, Nature 392:245-251, 1998) and have been shown to be
effective as a physiological adjuvant for eliciting prophylactic or therapeutic antitumor
immunity (see Timmerman and Levy, Ann. Rev. Med. 50:507-529, 1999). In general,
dendritic cells may be identified based on their typical shape (stellate in situ, with
marked cytoplasmic processes (dendrites) visible in vitro), their ability to take up,
process and present antigens with high efficiency and their ability to activate naive T cell
responses. Dendritic cells may, of course, be engineered to express specific cell-surface
receptors or ligands that are not commonly found on dendritic cells in vivo or ex vivo,
and such modified dendritic cells are contemplated by the present invention. As an
alternative to dendritic cells, vesicles secreted by antigen-loaded dendritic cells (called
exosomes) may be used within a vaccine (see Zitvogel et al., Nature Med. 4:594-600, 
1998). 

Dendritic cells and progenitors may be obtained from peripheral blood,
bone marrow, tumor-infiltrating cells, peritumoral tissues-infiltrating cells, lymph nodes, 
spleen, skin, umbilical cord blood or any other suitable tissue or fluid. For example, 
dendritic cells may be differentiated ex vivo by adding a combination of cytokines such 
as GM-CSF, IL-4, IL-13 and/or TNFα to cultures of monocytes harvested from
peripheral blood. Alternatively, CD34 positive cells harvested from peripheral blood,
umbilical cord blood or bone marrow may be differentiated into dendritic cells by adding
to the culture medium combinations of GM-CSF, IL-3, TNFα, CD40 ligand, LPS, flt3
ligand and/or other compound(s) that induce differentiation, maturation and proliferation 
of dendritic cells. 

Dendritic cells are conveniently categorized as "immature" and "mature"
cells, which allows a simple way to discriminate between two well characterized 
phenotypes. However, this nomenclature should not be construed to exclude all possible 
intermediate stages of differentiation. Immature dendritic cells are characterized as APCs
with a high capacity for antigen uptake and processing, which correlates with the high
expression of Fcγ receptor and mannose receptor. The mature phenotype is typically
characterized by a lower expression of these markers, but a high expression of cell
surface molecules responsible for T cell activation such as class I and class II MHC,
adhesion molecules (e.g., CD54 and CD11) and costimulatory molecules (e.g., CD40,
CD80, CD86 and 4-1BB).

APCs may generally be transfected with a polynucleotide encoding a
polypeptide of the present invention (or portion or other variant thereof) such that the 
polypeptide, or an immunogenic portion thereof, is expressed on the cell surface. Such 
transfection may take place ex vivo, and a composition or vaccine comprising such 
transfected cells may then be used for therapeutic purposes, as described herein.
Alternatively, a gene delivery vehicle that targets a dendritic or other antigen presenting
cell may be administered to a patient, resulting in transfection that occurs in vivo. In vivo
and ex vivo transfection of dendritic cells, for example, may generally be performed
using any methods known in the art, such as those described in WO 97/24447, or the
gene gun approach described by Mahvi et al., Immunology and Cell Biology 75:456-460,
1997. Antigen loading of dendritic cells may be achieved by incubating dendritic cells or
progenitor cells with the polypeptide, DNA (naked or within a plasmid vector) or RNA;
or with antigen-expressing recombinant bacteria or viruses (e.g., vaccinia, fowlpox,
adenovirus or lentivirus vectors). Prior to loading, the polypeptide may be covalently
conjugated to an immunological partner that provides T cell help (e.g., a carrier
molecule). Alternatively, a dendritic cell may be pulsed with a non-conjugated
immunological partner, separately or in the presence of the polypeptide.

Vaccines and pharmaceutical compositions may be presented in unit-dose 
or multi-dose containers, such as sealed ampoules or vials. Such containers are 
preferably hermetically sealed to preserve sterility of the formulation until use. In 
general, formulations may be stored as suspensions, solutions or emulsions in oily or
aqueous vehicles. Alternatively, a vaccine or pharmaceutical composition may be stored
in a freeze-dried condition requiring only the addition of a sterile liquid carrier 
immediately prior to use. 

The above pharmaceutical compositions and vaccines may be used, for 
example, for the therapy of breast cancer in a patient. As used herein, a "patient" refers
to any warm-blooded animal, preferably a human. A patient may or may not be afflicted
with breast cancer. Accordingly, the above pharmaceutical compositions and vaccines
may be used to prevent the development of breast cancer or to treat a patient afflicted
with breast cancer. In a preferred embodiment, the compounds are administered either
prior to or following surgical removal of primary tumors and/or treatment by
administration of radiotherapy and conventional chemotherapeutic drugs. To prevent or
slow the development of breast cancer, a pharmaceutical composition or vaccine
comprising one or more polypeptides as described herein may be administered to a
patient. Alternatively, naked DNA or plasmid or viral vector encoding the polypeptide
may be administered. For treating a patient with breast cancer, the pharmaceutical
composition or vaccine may comprise one or more polypeptides, antibodies or
polynucleotides complementary to DNA encoding a polypeptide as described herein
(e.g., antisense RNA or antisense deoxyribonucleotide oligonucleotides).

Routes and frequency of administration, as well as dosage, will vary from
individual to individual. In general, the pharmaceutical compositions and vaccines may
be administered by injection (e.g., intracutaneous, intramuscular, intravenous or
subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 10 doses may
be administered over a 52-week period. Preferably, 6 doses are administered, at intervals
of 1 month, and booster vaccinations may be given periodically thereafter. Alternate
protocols may be appropriate for individual patients. A suitable dose is an amount of a
compound that, when administered as described above, is capable of promoting an
anti-tumor immune response. Such response can be monitored by measuring the anti-tumor
antibodies in a patient or by vaccine-dependent generation of cytolytic effector cells
capable of killing the patient's tumor cells in vitro. Such vaccines should also be capable
of causing an immune response that leads to an improved clinical outcome (e.g., more
frequent remissions, complete or partial or longer disease-free survival) in vaccinated
patients as compared to non-vaccinated patients. In general, for pharmaceutical
compositions and vaccines comprising one or more polypeptides, the amount of each
polypeptide present in a dose ranges from about 100 µg to 5 mg. Suitable dose sizes will
vary with the size of the patient, but will typically range from about 0.1 mL to about 5
mL.

Polypeptides disclosed herein may also be employed in adoptive 
immunotherapy for the treatment of cancer. Adoptive immunotherapy may be broadly 
classified into either active or passive immunotherapy. In active immunotherapy, 
treatment relies on the in vivo stimulation of the endogenous host immune system to
react against tumors with the administration of immune response-modifying agents (for
example, tumor vaccines, bacterial adjuvants, and/or cytokines). 

In passive immunotherapy, treatment involves the delivery of biologic 
reagents with established tumor-immune reactivity (such as effector cells or antibodies) 
that can directly or indirectly mediate antitumor effects and does not necessarily depend
on an intact host immune system. Examples of effector cells include T lymphocytes (for
example, CD8+ cytotoxic T-lymphocyte, CD4+ T-helper, tumor-infiltrating
lymphocytes), killer cells (Natural Killer cells, lymphokine-activated killer cells), B
cells, or antigen presenting cells (such as dendritic cells and macrophages) expressing the
disclosed antigens. The polypeptides disclosed herein may also be used to generate 
antibodies or anti-idiotypic antibodies (as in U.S. Patent No. 4,918,164), for passive 
immunotherapy. 

The predominant method of procuring adequate numbers of T-cells for
adoptive immunotherapy is to grow immune T-cells in vitro. Culture conditions for
expanding single antigen-specific T-cells to several billion in number with retention of
antigen recognition in vivo are well known in the art. These in vitro culture conditions
typically utilize intermittent stimulation with antigen, often in the presence of cytokines,
such as IL-2, and non-dividing feeder cells. As noted above, the immunoreactive
polypeptides described herein may be used to rapidly expand antigen-specific T cell
cultures in order to generate a sufficient number of cells for immunotherapy. In particular,
antigen-presenting cells, such as dendritic cells, macrophages or B-cells, may be pulsed with
immunoreactive polypeptides or transfected with a polynucleotide sequence(s), using
standard techniques well known in the art. For cultured T-cells to be effective in
therapy, the cultured T-cells must be able to grow and distribute widely and to survive
long term in vivo. Studies have demonstrated that cultured T-cells can be induced to
grow in vivo and to survive long term in substantial numbers by repeated stimulation
with antigen supplemented with IL-2 (see, for example, Cheever et al. Ibid).

The polypeptides disclosed herein may also be employed to generate
and/or isolate tumor-reactive T-cells, which can then be administered to the patient. In
one technique, antigen-specific T-cell lines may be generated by in vivo immunization
with short peptides corresponding to immunogenic portions of the disclosed
polypeptides. The resulting antigen specific CD8+ CTL clones may be isolated from the
patient, expanded using standard tissue culture techniques, and returned to the patient.

Alternatively, peptides corresponding to immunogenic portions of the 
polypeptides may be employed to generate tumor reactive T cell subsets by selective in 
vitro stimulation and expansion of autologous T cells to provide antigen-specific T cells 
which may be subsequently transferred to the patient as described, for example, by
Chang et al. (Crit. Rev. Oncol. Hematol., 22(3), 213, 1996).

In another embodiment, syngeneic or autologous dendritic cells may be 
pulsed with peptides corresponding to at least an immunogenic portion of a polypeptide 
disclosed herein. The resulting antigen-specific dendritic cells may either be transferred
into a patient, or employed to stimulate T cells to provide antigen-specific T cells which
may, in turn, be administered to a patient. The use of peptide-pulsed dendritic cells to
generate antigen-specific T cells and the subsequent use of such antigen-specific T cells
to eradicate tumors in a murine model has been demonstrated by Cheever et al.
("Therapy With Cultured T Cells: Principles Revisited," Immunological Reviews,
157:177, 1997). 

Additionally, vectors expressing the disclosed polynucleotides may be introduced
into stem cells taken from the patient and clonally propagated in vitro for autologous
transplant back into the same patient. In one embodiment, cells of the immune system,
such as T cells, may be isolated from the peripheral blood of a patient, using a
commercially available cell separation system, such as CellPro Incorporated's (Bothell,
WA) CEPRATE™ system (see U.S. Patent No. 5,240,856; U.S. Patent No. 5,215,926;
WO 89/06280; WO 91/16116 and WO 92/07243). The separated cells are stimulated
with one or more of the immunoreactive polypeptides contained within a delivery
vehicle, such as a microsphere, to provide antigen-specific T cells. The population of
tumor antigen-specific T cells is then expanded using standard techniques and the cells
are administered back to the patient.

The following Examples are offered by way of illustration and not by way
of limitation.

EXAMPLES 

EXAMPLE 1

Preparation of Breast Tumor-Specific cDNAs Using 
Differential Display RT-PCR 

This Example illustrates the preparation of cDNA molecules encoding 
breast tumor-specific polypeptides using a differential display screen.

A. Preparation of B18Ag1 cDNA and Characterization of mRNA Expression

Tissue samples were prepared from breast tumor and normal tissue of a
patient with breast cancer that was confirmed by pathology after removal from the
patient. Normal RNA and tumor RNA were extracted from the samples and mRNA was
isolated and converted into cDNA using a (dT)12AG (SEQ ID NO: 130) anchored 3'
primer. Differential display PCR was then executed using a randomly chosen primer
(CTTCAACCTC) (SEQ ID NO: 103). Amplification conditions were standard buffer
containing 1.5 mM MgCl2, 20 pmol of primer, 500 pmol dNTP, and 1 unit of Taq DNA
polymerase (Perkin-Elmer, Branchburg, NJ). Forty cycles of amplification were
performed using 94°C denaturation for 30 seconds, 42°C annealing for 1 minute, and
72°C extension for 30 seconds. An RNA fingerprint containing 76 amplified products was
obtained. Although the RNA fingerprint of breast tumor tissue was over 98% identical to
that of the normal breast tissue, a band was repeatedly observed to be specific to the
RNA fingerprint pattern of the tumor. This band was cut out of a silver stained gel,
subcloned into the T-vector (Novagen, Madison, WI) and sequenced.
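
The band comparison underlying a differential display screen can be sketched as a set difference between the tumor and normal fingerprints. The following is a minimal illustration only: the band labels are hypothetical, the function name compare_fingerprints is not from the original, and the 75-shared-plus-one-extra arrangement is an assumption chosen to be consistent with the "76 amplified products" and "over 98% identical" figures above.

```python
def compare_fingerprints(tumor_bands, normal_bands):
    """Return (tumor-specific bands, fraction of all observed bands that are shared)."""
    tumor, normal = set(tumor_bands), set(normal_bands)
    shared = tumor & normal
    tumor_specific = sorted(tumor - normal)
    identity = len(shared) / len(tumor | normal)
    return tumor_specific, identity


# Hypothetical band labels (assumption): 75 bands seen in both lanes,
# plus one band seen only in the tumor lane.
normal = [f"band_{i}" for i in range(75)]
tumor = normal + ["band_tumor_only"]

specific, identity = compare_fingerprints(tumor, normal)
print(specific, f"{identity:.1%}")
```

Under this assumption the tumor-only band is the candidate that would be excised, subcloned and sequenced.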

The sequence of the cDNA, referred to as B18Ag1, is provided in SEQ ID
NO:1. A database search of GENBANK and EMBL revealed that the B18Ag1 fragment
initially cloned is 77% identical to the endogenous human retroviral element S71, which
is a truncated retroviral element homologous to the Simian Sarcoma Virus (SSV). S71
contains an incomplete gag gene, a portion of the pol gene and an LTR-like structure at
the 3' terminus (see Werner et al., Virology 174:225-238 (1990)). B18Ag1 is also 64%
identical to SSV in the region corresponding to the P30 (gag) locus. B18Ag1 contains
three separate and incomplete reading frames covering a region which shares
considerable homology to a wide variety of gag proteins of retroviruses which infect
mammals. In addition, the homology to S71 is not just within the gag gene, but spans
several kb of sequence including an LTR.

B18Ag1-specific PCR primers were synthesized using computer analysis
guidelines. RT-PCR amplification (94°C, 30 seconds; 60°C -> 42°C, 30 seconds; 72°C,
30 seconds for 40 cycles) confirmed that B18Ag1 represents an actual mRNA sequence
present at relatively high levels in the patient's breast tumor tissue. The primers used in
amplification were B18Ag1-1 (CTG CCT GAG CCA CAA ATG) (SEQ ID NO:128) and
B18Ag1-4 (CCG GAG GAG GAA GCT AGA GGA ATA) (SEQ ID NO:129) at a
3.5 mM magnesium concentration and a pH of 8.5, and B18Ag1-2 (ATG GCT ATT TTC
GGG GCC TGA CA) (SEQ ID NO:126) and B18Ag1-3 (CCG GTA TCT CCT CGT
GGG TAT T) (SEQ ID NO:127) at 2 mM magnesium at pH 9.5. The same experiments
showed exceedingly low to nonexistent levels of expression in this patient's normal
breast tissue (see Figure 1). RT-PCR experiments were then used to show that B18Ag1
mRNA is present in nine other breast tumor samples (from Brazilian and American
patients) but absent in, or at exceedingly low levels in, the normal breast tissue
corresponding to each cancer patient. RT-PCR analysis has also shown that the B18Ag1
transcript is not present in various normal tissues (including lymph node, myocardium
and liver) and present at relatively low levels in PBMC and lung tissue. The presence of
B18Ag1 mRNA in breast tumor samples, and its absence from normal breast tissue, has
been confirmed by Northern blot analysis, as shown in Figure 2.

The differential expression of B18Ag1 in breast tumor tissue was also
confirmed by RNase protection assays. Figure 3 shows the level of B18Ag1 mRNA in
various tissue types as determined in four different RNase protection assays. Lanes 1-12
represent various normal breast tissue samples; lanes 13-25 represent various breast
tumor samples; lanes 26-27 represent normal prostate samples; lanes 28-29 represent
prostate tumor samples; lanes 30-32 represent colon tumor samples; lane 33 represents
normal aorta; lane 34 represents normal small intestine; lane 35 represents normal skin;
lane 36 represents normal lymph node; lane 37 represents normal ovary; lane 38
represents normal liver; lane 39 represents normal skeletal muscle; lane 40 represents a
first normal stomach sample; lane 41 represents a second normal stomach sample; lane
42 represents a normal lung; lane 43 represents normal kidney; and lane 44 represents
normal pancreas. Interexperimental comparison was facilitated by including a positive
control RNA of known β-actin message abundance in each assay and normalizing the
results of the different assays with respect to this positive control.
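
The normalization step described above can be sketched as dividing each sample signal by the positive-control signal measured in the same assay, so that values from separate assays share a common scale. This is a minimal sketch under stated assumptions: the function name normalize_to_control, the sample names and the intensity values are all hypothetical, and the text does not specify the exact arithmetic that was used.

```python
def normalize_to_control(signals, control_signal):
    """Express each sample signal as a ratio to the same-assay positive control."""
    if control_signal <= 0:
        raise ValueError("positive-control signal must be greater than zero")
    return {sample: value / control_signal for sample, value in signals.items()}


# Hypothetical band intensities (arbitrary units) from two separate assays.
assay_1 = normalize_to_control(
    {"breast tumor A": 900.0, "normal breast A": 30.0}, control_signal=300.0
)
assay_2 = normalize_to_control(
    {"breast tumor B": 450.0, "normal liver": 15.0}, control_signal=150.0
)

# After normalization the two assays can be placed on one axis.
combined = {**assay_1, **assay_2}
for sample, ratio in combined.items():
    print(f"{sample}: {ratio:.2f}")
```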

RT-PCR and Southern Blot analysis have shown the B18Ag1 locus to be
present in human genomic DNA as a single copy endogenous retroviral element. A
genomic clone of approximately 12-18 kb was isolated using the initial B18Ag1
sequence as a probe. Four additional subclones were also isolated by XbaI digestion.
Additional retroviral sequences obtained from the ends of the XbaI digests of these
clones (located as shown in Figure 4) are shown as SEQ ID NO:3 - SEQ ID NO:10,
where SEQ ID NO:3 shows the location of the sequence labeled 10 in Figure 4, SEQ ID
NO:4 shows the location of the sequence labeled 11-29, SEQ ID NO:5 shows the
location of the sequence labeled 3, SEQ ID NO:6 shows the location of the sequence
labeled 6, SEQ ID NO:7 shows the location of the sequence labeled 12, SEQ ID NO:8
shows the location of the sequence labeled 13, SEQ ID NO:9 shows the location of the
sequence labeled 14 and SEQ ID NO:10 shows the location of the sequence labeled
11-22.

Subsequent studies demonstrated that the 12-18 kb genomic clone
contains a retroviral element of about 7.75 kb, as shown in Figures 5A and 5B. The
sequence of this retroviral element is shown in SEQ ID NO: 141. The numbered line at
the top of Figure 5A represents the sense strand sequence of the retroviral genomic clone.
The box below this line shows the position of selected restriction sites. The arrows
depict the different overlapping clones used to sequence the retroviral element. The
direction of the arrow shows whether the single-pass subclone sequence corresponded to
the sense or anti-sense strand. Figure 5B is a schematic diagram of the retroviral element
containing B18Ag1 depicting the organization of viral genes within the element. The
open boxes correspond to predicted reading frames, starting with a methionine, found
throughout the element. Each of the six likely reading frames is shown, as indicated to
the left of the boxes, with frames 1-3 corresponding to those found on the sense strand.

Using the cDNA of SEQ ID NO:1 as a probe, a longer cDNA was
obtained (SEQ ID NO:227) which contains minor nucleotide differences (less than 1%) 
compared to the genomic sequence shown in SEQ ID NO: 141. 

B. Preparation of cDNA Molecules Encoding Other Breast Tumor-Specific
Polypeptides 

Normal RNA and tumor RNA were prepared and mRNA was isolated and
converted into cDNA using a (dT)12AG anchored 3' primer, as described above.
Differential display PCR was then executed using the randomly chosen primers of SEQ
ID NO: 87-125. Amplification conditions were as noted above, and bands observed to
be specific to the RNA fingerprint pattern of the tumor were cut out of a silver stained
gel, subcloned into either the T-vector (Novagen, Madison, WI) or the pCRII vector
(Invitrogen, San Diego, CA) and sequenced. The sequences are provided in SEQ ID
NO:11 - SEQ ID NO:86. Of the 79 sequences isolated, 67 were found to be novel (SEQ
ID NO: 11-26 and 28-77) (see also Figures 6-20).

An extended DNA sequence (SEQ ID NO: 290) for the antigen B15Ag1
(originally identified partial sequence provided in SEQ ID NO: 27) was obtained in
further studies. Comparison of the sequence of SEQ ID NO: 290 with those in the gene
bank as described above revealed homology to the known human β-A activin gene.
Further studies led to the isolation of the full-length cDNA sequence for the antigen
B21GT2 (also referred to as B311D; originally identified partial cDNA sequence
provided in SEQ ID NO: 56). The full-length sequence is provided in SEQ ID NO: 307,
with the corresponding amino acid sequence being provided in SEQ ID NO: 308.
Further studies led to the isolation of a splice variant of B311D. The B311D clone of
SEQ ID NO: 316 was sequenced and a XhoI/NotI fragment from this clone was gel
purified and 32P-dCTP labeled by random priming for use as a probe for further
screening to obtain additional B311D gene sequence. Two fractions of a human breast
tumor cDNA bacterial library were screened using standard techniques. One of the
clones isolated in this manner yielded additional sequence which includes a poly A+ tail.
The determined cDNA sequence of this clone (referred to as B311D_BT1_1A) is
provided in SEQ ID NO: 317. The sequences of SEQ ID NO: 316 and 317 were found to
share identity over a 464 bp region, with the sequences diverging near the poly A+
sequence of SEQ ID NO: 317.

Subsequent studies identified an additional 146 sequences (SEQ ID 
NOS:142-289), of which 115 appeared to be novel (SEQ ID NOS:142, 143, 146-152,
154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219,
221-240, 243-245, 247, 250, 251, 253, 255, 257-266, 268, 269, 271-273, 275, 276, 278, 280,
281, 284, 288 and 291). To the best of the inventors' knowledge none of the previously 
identified sequences have heretofore been shown to be expressed at a greater level in 
human breast tumor tissue than in normal breast tissue. 

In further studies, several different splice forms of the antigen B11Ag1
(also referred to as B305D) were isolated, with each of the various splice forms
containing slightly different versions of the B11Ag1 coding frame. Splice junction
sequences define individual exons which, in various patterns and arrangements, make up
the various splice forms. Primers were designed to examine the expression pattern of
each of the exons using RT-PCR as described below. Each exon was found to show the
same expression pattern as the original B11Ag1 clone, with expression being breast
tumor-, normal prostate- and normal testis-specific. The determined cDNA sequences
for the isolated protein coding exons are provided in SEQ ID NO: 292-298, respectively.
The predicted amino acid sequences corresponding to the sequences of SEQ ID NO: 292
and 298 are provided in SEQ ID NO: 299 and 300. Additional studies using rapid
amplification of cDNA ends (RACE), a 5' specific primer to one of the splice forms of
B11Ag1 provided above and a breast adenocarcinoma, led to the isolation of three
additional, related, splice forms referred to as isoforms B11C-15, B11C-8 and
B11C-9,16. The determined cDNA sequences for these isoforms are provided in SEQ ID NO:
301-303, with the corresponding predicted amino acid sequences being provided in SEQ
ID NO: 304-306.

In subsequent studies on B305D isoform A (cDNA sequence provided in
SEQ ID NO: 292), the cDNA sequence (provided in SEQ ID NO: [illegible]) was found to
contain an additional guanine residue at position 884, leading to a frameshift in the open
reading frame. The determined DNA sequence of this ORF is provided in SEQ ID NO:
314. This frameshift generates a protein sequence (provided in SEQ ID NO: 315) of 293
amino acids that contains the C-terminal domain common to the other isoforms of
B305D but that differs in the N-terminal region.
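
How a single inserted base shifts a reading frame can be illustrated with a toy sequence. The sketch below is not the B305D sequence: the 12-base input, the insertion position and the helper names translate and insert_base are all hypothetical, and only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate frame 1 of a DNA string; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_base(dna, position, base):
    """Insert `base` so that it occupies the 1-based `position` of the result."""
    return dna[:position - 1] + base + dna[position - 1:]


# Toy sequence (assumption, for illustration only).
original = "ATGGCTGAAACC"
mutant = insert_base(original, 7, "G")

print(translate(original))  # codons ATG GCT GAA ACC
print(translate(mutant))    # codons ATG GCT GGA AAC, then a leftover base
```

Every codon downstream of the insertion is read in a different register, which is why an extra residue can change the encoded protein over a whole region rather than at a single position.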

EXAMPLE 2 

Preparation of B18Ag1 DNA from Human Genomic DNA

This Example illustrates the preparation of B18Ag1 DNA by
amplification from human genomic DNA. 

B18Ag1 DNA may be prepared from 250 ng human genomic DNA using
20 pmol of B18Ag1 specific primers, 500 pmol dNTPs and 1 unit of Taq DNA
polymerase (Perkin Elmer, Branchburg, NJ) using the following amplification
parameters: 94°C for 30 seconds denaturing, 30 seconds 60°C to 42°C touchdown
annealing in 2°C increments every two cycles and 72°C extension for 30 seconds. The
last increment (a 42°C annealing temperature) should cycle 25 times. Primers were
selected using computer analysis. Primers synthesized were B18Ag1-1, B18Ag1-2,
B18Ag1-3, and B18Ag1-4. Primer pairs that may be used are 1+3, 1+4, 2+3, and 2+4.
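
The touchdown annealing profile above can be written out cycle by cycle. The sketch below reflects one reading of the text, namely that each temperature from 60°C down to 44°C is held for two cycles and that the final 42°C step then runs for 25 cycles; the function name touchdown_schedule and its parameters are hypothetical.

```python
def touchdown_schedule(start=60, end=42, step=2, cycles_per_step=2, final_cycles=25):
    """Return the annealing temperature (°C) used in each successive cycle."""
    temps = []
    for temp in range(start, end, -step):  # 60, 58, ..., 44
        temps.extend([temp] * cycles_per_step)
    temps.extend([end] * final_cycles)     # final plateau at the lowest temperature
    return temps


schedule = touchdown_schedule()
print(len(schedule), "cycles in total")
print(schedule[:6], "...", schedule[-3:])
```

Under this reading the program runs 18 touchdown cycles followed by 25 cycles at 42°C. Other readings (for example, counting the 42°C step within a 40-cycle total, as in the RT-PCR conditions of Example 1) would change the totals, so the parameters are left adjustable.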

Following gel electrophoresis, the band corresponding to B18Ag1 DNA
may be excised and cloned into a suitable vector. 

EXAMPLE 3 

Preparation of B18Ag1 DNA from Breast Tumor cDNA 

This Example illustrates the preparation of B18Ag1 DNA by
amplification from human breast tumor cDNA. 

First strand cDNA is synthesized from RNA prepared from human breast
tumor tissue in a reaction mixture containing 500 ng poly A+ RNA, 200 pmol of the
primer (T)12AG (i.e., TTT TTT TTT TTT AG) (SEQ ID NO: 130), 1X first strand reverse
transcriptase buffer, 6.7 mM DTT, 500 mmol dNTPs, and 1 unit AMV or MMLV
reverse transcriptase (from any supplier, such as Gibco-BRL (Grand Island, NY)) in a
final volume of 30 µl. After first strand synthesis, the cDNA is diluted approximately 25
fold and 1 µl is used for amplification as described in Example 2. While some primer
pairs can result in a heterogeneous population of transcripts, the primers B18Ag1-2
(5'ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and B18Ag1-3
(5'CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127) yield a single 151 bp
amplification product.

EXAMPLE 4 

Identification of B-cell and T-cell Epitopes of B18Ag1 



This Example illustrates the identification of B18Ag1 epitopes.

The B18Ag1 sequence can be screened using a variety of computer
algorithms. To determine B-cell epitopes, the sequence can be screened for
hydrophobicity and hydrophilicity values using the method of Hopp, Prog. Clin. Biol.
Res. 172B:367-77 (1985) or, alternatively, Cease et al., J. Exp. Med. 164:1779-84 (1986)
or Spouge et al., J. Immunol. 138:204-12 (1987). Additional Class II MHC (antibody or
B-cell) epitopes can be predicted using programs such as AMPHI (e.g., Margalit et al., J.
Immunol. 138:2213 (1987)) or the methods of Rothbard and Taylor (e.g., EMBO J. 7:93
(1988)).
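
A hydrophilicity screen of this kind can be sketched as a sliding-window average over per-residue values. The example below is an illustration under stated assumptions rather than the cited method itself: it uses the Hopp-Woods hydrophilicity values as they are commonly tabulated (they should be checked against a primary source before use), a six-residue window, a hypothetical input sequence, and a score of 0.0 for any unrecognized residue such as X.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (assumption: verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3,
    "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return (start index, mean hydrophilicity) for every window in the sequence."""
    scores = [HOPP_WOODS.get(residue, 0.0) for residue in sequence.upper()]
    return [
        (start, sum(scores[start:start + window]) / window)
        for start in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return the highest-scoring window and its mean hydrophilicity."""
    profile = hydrophilicity_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start, score = max(profile, key=lambda item: item[1])
    return sequence[start:start + window], score


# Hypothetical sequence: a charged stretch flanked by leucines.
peptide, score = most_hydrophilic_window("LLLLKDEERKLLLL")
print(peptide, score)
```

Windows with the highest averages are candidate surface-exposed regions; whether they are in fact antigenic has to be established experimentally, as described in the following paragraph.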

Once peptides (15-20 amino acids long) are identified using these
techniques, individual peptides can be synthesized using automated peptide synthesis
equipment (available from manufacturers such as Perkin Elmer/Applied Biosystems
Division, Foster City, CA) and techniques such as Merrifield synthesis. Following
synthesis, the peptides can be used to screen sera harvested from either normal or breast
cancer patients to determine whether patients with breast cancer possess antibodies
reactive with the peptides. Presence of such antibodies in breast cancer patients would
confirm the immunogenicity of the specific B-cell epitope in question. The peptides can
also be tested for their ability to generate a serologic or humoral immune response in animals
(mice, rats, rabbits, chimps etc.) following immunization in vivo. Generation of a
peptide-specific antiserum following such immunization further confirms the
immunogenicity of the specific B-cell epitope in question.

To identify T-cell epitopes, the B18Ag1 sequence can be screened using
different computer algorithms which are useful in identifying 8-10 amino acid motifs
within the B18Ag1 sequence which are capable of binding to HLA Class I MHC
molecules (see, e.g., Rammensee et al., Immunogenetics 41:178-228 (1995)). Following
synthesis, such peptides can be tested for their ability to bind to class I MHC using
standard binding assays (e.g., Sette et al., J. Immunol. 153:5586-92 (1994)) and, more
importantly, can be tested for their ability to generate antigen reactive cytotoxic T-cells
following in vitro stimulation of patient or normal peripheral mononuclear cells using,
for example, the methods of Bakker et al., Cancer Res. 55:5330-34 (1995); Visseren et
al., J. Immunol. 154:3991-98 (1995); Kawakami et al., J. Immunol. 154:3961-68 (1995);
and Kast et al., J. Immunol. 152:3904-12 (1994). Successful in vitro generation of
T-cells capable of killing autologous (bearing the same Class I MHC molecules) tumor
cells following in vitro peptide stimulation further confirms the immunogenicity of the
B18Ag1 antigen. Furthermore, such peptides may be used to generate murine peptide
and B18Ag1 reactive cytotoxic T-cells following in vivo immunization in mice rendered
transgenic for expression of a particular human MHC Class I haplotype (Vitiello et al., J.
Exp. Med. 173:1007-15 (1991)).

A representative list of predicted B18Ag1 B-cell and T-cell epitopes,
broken down according to predicted HLA Class I MHC binding antigen, is shown below:

Predicted Th Motifs (B-cell epitopes) (SEQ ID NOS: 131-133)
SSGGRTFDDFHRYLLVGI 
QGAAQKPINLSKXIEVVQGHDE 
SPGVFLEHLQEAYRIYTPFDLSA 



Predicted HLA A2.1 Motifs (T-cell epitopes) (SEQ ID NOS: 134-140)
YLLVGIQGA 
GAAQKPINL 
NLSKXIEVV 
EVVQGHDES 
HLQEAYRIY 
NLAFVAQAA 
FVAQAAPDS 



EXAMPLE 5 
Identification of T-cell Epitopes of B11Ag1

This Example illustrates the identification of B11Ag1 (also referred to as
B305D) epitopes. Four peptides, referred to as B11-8, B11-1, B11-5 and B11-12 (SEQ
ID NO: 309-312, respectively) were derived from the B11Ag1 gene.

Human CD8 T cells were primed in vitro to the peptide B11-8 using 
dendritic cells according to the protocol of Van Tsai et al. (Critical Reviews in 
Immunology 18:65-75, 1998). The resulting CD8 T cell cultures were tested for their 
ability to recognize the B11-8 peptide or a negative control peptide, presented by the 
B-LCL line, JY. Briefly, T cells were incubated with autologous monocytes in the presence 
of 10 ug/ml peptide, 10 ng/ml IL-7 and 10 ug/ml IL-2, and assayed for their ability to 
specifically lyse target cells in a standard 51-Cr release assay. As shown in Fig. 22, the 
bulk culture line demonstrated strong recognition of the B11-8 peptide with weaker 
recognition of the peptide B11-1. 
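
For reference, the readout of a chromium release assay is conventionally expressed as percent specific lysis. The sketch below uses the commonly cited formula; the function name and the counts are hypothetical and are not data from this Example.

```python
def percent_specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Percent specific 51-Cr release computed from counts per minute (cpm).

    experimental_cpm: release from targets incubated with effector T cells
    spontaneous_cpm:  release from targets incubated in medium alone
    maximum_cpm:      release from targets fully lysed (e.g. with detergent)
    """
    window = maximum_cpm - spontaneous_cpm
    if window <= 0:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / window


# Hypothetical counts, for illustration only.
lysis = percent_specific_lysis(1500.0, 500.0, 2500.0)
print(f"{lysis:.1f}% specific lysis")
```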

A clone from this CTL line was isolated following rapid expansion using 
the monoclonal antibody OKT3 and human IL-2. As shown in Fig. 23, this clone 
(referred to as A1), in addition to being able to recognize specific peptide, recognized JY 
LCL transduced with the B11Ag1 gene. These data demonstrate that B11-8 is a naturally 
processed epitope of the B11Ag1 gene. In addition, these T cells were further found to 
recognize and lyse, in an HLA-A2 restricted manner, an established tumor cell line 
naturally expressing B11Ag1 (Fig. 24). The T cells strongly recognize a lung 
adenocarcinoma (LT-140-22) naturally expressing B11Ag1 and transduced with HLA-A2, as 
well as an A2+ breast carcinoma (CAMA-1) transduced with B11Ag1, but not 
untransduced lines or another negative tumor line (SW620). 

These data clearly demonstrate that these human T cells recognize not 
only B11-specific peptides but also transduced cells, as well as naturally expressing 
tumor lines. 

CTL lines raised against the antigens B11-5 and B11-12, using the 
procedures described above, were found to recognize corresponding peptide-coated 
targets. 
EXAMPLE 6 

Characterization of Breast Tumor Genes Discovered by 
Differential Display PCR 

The specificity and sensitivity of the breast tumor genes discovered by 
differential display PCR were determined using RT-PCR. This procedure enabled the 
rapid evaluation of breast tumor gene mRNA expression semiquantitatively without 
using large amounts of RNA. Using gene specific primers, mRNA expression levels in a 
variety of tissues were examined, including 8 breast tumors, 5 normal breasts, 2 prostate 
tumors, 2 colon tumors, 1 lung tumor, and 14 other normal adult human tissues, 
including normal prostate, colon, kidney, liver, lung, ovary, pancreas, skeletal muscle, 
skin, stomach and testes. 

To ensure the semiquantitative nature of the RT-PCR, β-actin was used as 
an internal control for each of the tissues examined. Serial dilutions of the first strand 
cDNAs were prepared and RT-PCR assays performed using β-actin specific primers. A 
dilution was then selected that enabled the linear range amplification of β-actin template, 
and which was sensitive enough to reflect the difference in the initial copy number. 
Using this condition, the β-actin levels were determined for each reverse transcription 
reaction from each tissue. DNA contamination was minimized by DNase treatment and 
by assuring a negative result when using first strand cDNA that was prepared without 
adding reverse transcriptase. 
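
One common way to use an internal control of this kind is to express each gene-specific signal relative to the β-actin signal from the same reverse transcription reaction. The sketch below illustrates that idea only; the ratio-based normalization, the function name and the band intensities are assumptions for illustration and are not taken from the procedure described above.

```python
def normalize_to_actin(target_signal, actin_signal):
    """Express each tissue's target band intensity relative to beta-actin.

    Both arguments map a tissue name to a band intensity (arbitrary units)
    measured at a dilution within the linear range of amplification.
    """
    normalized = {}
    for tissue, target in target_signal.items():
        actin = actin_signal[tissue]
        if actin <= 0:
            raise ValueError(f"no beta-actin signal for {tissue}")
        normalized[tissue] = target / actin
    return normalized


# Hypothetical densitometry readings, for illustration only.
target = {"breast tumor": 900.0, "normal breast": 100.0}
actin = {"breast tumor": 1000.0, "normal breast": 800.0}
ratios = normalize_to_actin(target, actin)
for tissue, ratio in ratios.items():
    print(f"{tissue}: {ratio:.3f}")
```

Dividing by the β-actin signal corrects for differences in the amount of cDNA loaded per tissue, which is why the control must itself be measured within its linear range.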

Using gene specific primers, the mRNA expression levels were 
determined in a variety of tissues. To date, 38 genes have been successfully examined by 
RT-PCR, five of which exhibit good specificity and sensitivity for breast tumors 
(B15AG-1, B31GA1b, B38GA2a, B11A1a and B18AG1a). Figures 21A and 21B depict 
the results for three of these genes: B15AG-1 (SEQ ID NO:27), B31GA1b (SEQ ID 
NO:148) and B38GA2a (SEQ ID NO:157). Table I summarizes the expression level of 
all the genes tested in normal breast tissue and breast tumors, and also in other tissues. 



TABLE I 

Percentage of Breast Cancer Antigens that are Expressed in Various Tissues 

Breast Tissues   Over-expressed in Breast Tumors                                              84% 
                 Equally Expressed in Normals and Tumor                                       16% 

Other Tissues    Over-expressed in Breast Tumors but not in any Normal Tissues                 9% 
                 Over-expressed in Breast Tumors but Expressed in Some Normal Tissues         30% 
                 Over-expressed in Breast Tumors but Equally Expressed in All Other Tissues   61% 



From the foregoing, it will be appreciated that, although specific 
embodiments of the invention have been described herein for the purpose of illustration, 
various modifications may be made without deviating from the spirit and scope of the 
invention. 




